\section{Our technique}
\label{sec-our-technique}

We define a \clos{}-based protocol for accessing and augmenting a
lexical environment.  This protocol is defined and implemented in the
\trucler{} library.

\subsection{Querying the environment}
\label{querying-the-environment}

A language processor calls one of the query functions in order to
determine the nature of a language element, depending on the position
in source code of that language element.  All these functions are
generic, and they all take a \texttt{client} parameter and an
\texttt{environment} parameter.  Methods defined by \trucler{} do not
specialize to the \texttt{client} parameter.  Client code should pass
an object specific to the application as a value of that parameter,
and it can supply methods specialized to the class of this object, for
the purpose of extending or overriding default behavior.  The
\texttt{environment} parameter is an object of the type used by the
implementation that \trucler{} is configured for.  Functions that are
used to query a particular \emph{name} have an additional parameter
for this purpose.

The following query functions are defined by \trucler{}.  Each one
returns an instance of a class that allows the language processor to
determine the exact nature of the language element (\texttt{nil} is
returned if there is no definition for the element), for example by
using the instance in a call to a generic function:

\begin{itemize}
\item \texttt{describe-variable}.  This function returns an instance
  of a class that distinguishes lexical variables, special variables,
  constant variables, and symbol macros.
\item \texttt{describe-function}.  This function returns an instance
  of a class that distinguishes global functions, local functions, and
  macros.
\item \texttt{describe-block}.
\item \texttt{describe-tag}.
\item \texttt{describe-optimize}.
\item \texttt{describe-declarations}.  This function is called by the
  language processor in order to determine the declaration identifiers
  of \texttt{declaration} proclamations.
\end{itemize}

\subsection{Augmenting the environment}

A language processor calls one of the augmentation functions in order
to define a lexical environment within the scope of a declaration or a
definition encountered in source code.  All these functions take at
least a \texttt{client} parameter and an \texttt{environment}
parameter just like the query functions, and they all return a new
lexical environment, augmented according to the function being called.

The following functions are called by the language processor when a
local definition is encountered, and they return a new environment that
includes the new definition:

\begin{itemize}
\item \texttt{add-lexical-variable}.
\item \texttt{add-special-variable}.
\item \texttt{add-local-symbol-macro}.
\item \texttt{add-local-function}.
\item \texttt{add-local-macro}.
\item \texttt{add-block}.
\item \texttt{add-tag}.
\end{itemize}

\noindent
The following functions are called by the language processor as the
result of a local declaration that restricts an existing local
function or variable:

\begin{itemize}
\item \texttt{add-variable-type}.
\item \texttt{add-variable-ignore}.
\item \texttt{add-variable-dynamic-extent}.
\item \texttt{add-function-type}.
\item \texttt{add-function-ignore}.
\item \texttt{add-function-dynamic-extent}.
\end{itemize}

\noindent
The following functions are called by the language processor as the
result of a local \texttt{optimize} declaration.

\begin{itemize}
\item \texttt{add-inline}.
\item \texttt{add-speed}.
\item \texttt{add-compilation-speed}.
\item \texttt{add-debug}.
\item \texttt{add-safety}.
\item \texttt{add-space}.
\end{itemize}

\subsection{Restricting the environment}

Recall that the description of the function \texttt{enclose} in
section 8.5 of CLtL2 mentions that the consequences are undefined if
the \textit{lambda-expression} argument contains references to
entities in the environment that are not available at compile time,
such as lexically visible bindings of variable and functions,
\texttt{go} tags, or block names.

As a service to a robust implementation of the \texttt{enclose}
function, the \trucler{} library provides a function named
\texttt{restrict-for-macrolet-expander} that takes an environment as
an argument and returns an environment that contains only entities
available at compile time.  Using this function, the implementation of
\texttt{enclose} can return a function that will signal an error if
the \textit{lambda-expression} argument contains unavailable
references.

\subsection{The reference implementation}

\trucler{} supports some existing \commonlisp{} implementations as
described in \refSec{trucler-supported-implementations}, but it also
comes with a \emph{reference implementation} that can be used by a new
\commonlisp{} implementation that does not have its own representation
of lexical environments.  The reference implementation is used by
\sicl{}%
\footnote{https://github.com/robert-strandh/SICL} for instance.

In the reference implementation, a lexical environment is represented
as a standard object containing a slot for each type of description to
be returned by a query function as described in
\refSec{querying-the-environment}.  Each slot contains a list of
descriptions ordered from innermost to outermost.  A query function
merely returns the first item on the list that matches the name that
was passed as an argument to the query function.  As a direct
consequence of this representation, there is no performance penalty in
the query functions, due to the fact that a new environment is created
for every call to an augmentation function.

In order to create new objects such as environments or descriptions,
we use a technique that we call \emph{quasi cloning}.  A generic
function named \texttt{cloning-information} is called with the
original object as an argument.  This function then returns a list of
pairs.  The first element of the pair is a slot initialization argument
for the class of the object and the second element of the pair is the
name of a slot reader for the same slot.  This information is then
used to access slots in the original object and to pass that
information as an initialization argument to \texttt{make-instance}.
We call it quasi cloning, because some new value is prepended to the
initialization arguments so that the copy is like the original, except
for one slot.

The advantage of quasi cloning is that \trucler{} does not need to know
the right class to instantiate.  It creates an instance of the same
class as the original object, and that class can be defined by client
code.  Client code must define a method on
\texttt{cloning-information}, but this generic function uses the
\texttt{append} method combination, so that only slots defined by
client code need to be mentioned in that method.

Occasionally, an entirely new instance of some class must be created,
rather than being obtained by quasi cloning an existing instance.
This situation occurs when information about a new item such as a
local variable or a local function must be used to augment an existing
environment.  To allow \trucler{} to create an instance of a class
that has been determined by client code, \trucler{} first calls what
we call a \emph{factory} function.  This function takes the
\texttt{client} object and returns the class metaobject to
instantiate.  For example, to create an instance of a class that
describes lexical variables, \trucler{} calls the function
\texttt{lexical-variable-description-class}, passing it the
\texttt{client} object supplied by client code.  The default method on
this generic function returns the default class used by the reference
implementation, but client code that needs additional information
about lexical variables may create a subclass of the default class,
and a method on \texttt{lexical-variable-description-class} that
returns this subclass.

\subsection{Supported \commonlisp{} implementations}
\label{trucler-supported-implementations}

\trucler{} currently provides support for \sbcl{} and \ccl{}.
Contributions for other \commonlisp{} implementations are welcome.
With these implementations, it is possible to write code walkers that
are portable across different \commonlisp{} implementations.  In
particular, a \cleavir{}-based compiler can compile source code for
any of the supported implementations.

\subsection{Examples}

In this section, we show some examples of how \trucler{} can be used
by a code walker.  All examples are from \cleavir{} used in the
\sicl{} compiler.  We have simplified the examples compared to the
actual code, in order to avoid too much clutter.  For example, we have
omitted the handling of error situations and restarts.

The part of \cleavir{} that uses \trucler{} is the phase that converts
a \emph{concrete syntax tree} (CST) to an \emph{abstract syntax tree}
(AST).  A concrete syntax tree can be thought of as a \commonlisp{}
expression but where each sub-expression is wrapped in a standard
object that holds additional information such as source location.  At
the core of this compilation phase is the generic function
\texttt{convert-cst}.  For each class of description objects that
\trucler{} can return, this generic function has a method specialized
to that class.

The function \texttt{convert-cst} is called by a top-level function
\texttt{convert} that determines the structure of the expression to
convert and calls the appropriate \trucler{} query function and then
invokes \texttt{convert-cst} with the object returned by \trucler{}.

The method specialized to \texttt{local-macro-description} looks like
this:

{\small
\begin{verbatim}
(defmethod convert-cst
    (client
     cst
     (info trucler:local-macro-description)
     environment)
  (let* ((expander (trucler:expander info))
         (expanded-form
          (expand-macro expander cst environment))
         (expanded-cst
          (cst:reconstruct expanded-form cst client)))
    (setf (cst:source expanded-cst) (cst:source cst))
    (with-preserved-toplevel-ness
      (convert client expanded-cst environment))))
\end{verbatim}
}

\noindent
As we can see, this method is specialized to the \trucler{} class
\texttt{local-macro-description}, and no other parameter is
specialized.  The code calls the accessor \texttt{expander} on the
\texttt{info} parameter, which returns the macro expander associated
with the local macro.  

The function \texttt{expand-macro} is responsible for taking into
account \texttt{*macroexpand-hook*} as the \commonlisp{} standard
requires.  The call to \texttt{reconstruct} has to do with preserving
source information in the expanded form.  The essence of the body is
the call to \texttt{convert} which converts the expanded form (wrapped
in a concrete syntax tree).

The next example shows how the environment is augmented when a
\texttt{block} special form is converted:

{\small
\begin{verbatim}
(defmethod convert-special
    (client
     (symbol (eql 'block))
     cst
     environment)
  (cst:db origin (block-cst name-cst . body-cst) cst
    (declare (ignore block-cst))
    (let ((name (cst:raw name-cst)))
      (let* ((ast (cleavir-ast:make-ast
                   'cleavir-ast:block-ast))
             (new-environment
               (trucler:add-block
                client environment name ast)))
        (setf (cleavir-ast:body-ast ast)
              (process-progn
               client
               (convert-sequence
                client body-cst new-environment)
               environment))
        ast))))
\end{verbatim}
}

\noindent
In the example above, \texttt{cst:db} is a version of the standard
\commonlisp{} operator \texttt{destructuring-bind}, that is used to
destructure concrete syntax trees as opposed to ordinary \commonlisp{}
source expressions.  The last argument to \texttt{add-block} is an
optional argument that \trucler{} calls \texttt{identity} and that
\trucler{} stores in the environment, associated with the block
information.  The nature of the object supplied is entirely determined
by client code.  In our case, we supply an abstract syntax tree that
represents the \texttt{block} special form so that when an associated
\texttt{return-from} is found, the two abstract syntax trees can be
connected.

The essence of the method body is the call to the function
named \texttt{convert-sequence} which converts the body of the
\texttt{block} form in the original environment augmented with
information about the \texttt{block} form.
